Segmentation of Speech and Humming in Vocal Input

Authors

  • Adam J. SPORKA
  • Ondřej POLÁČEK
  • Jan HAVLÍK
Abstract

Non-verbal vocal interaction (NVVI) is an interaction method in which sounds other than speech produced by a human are used, such as humming. NVVI complements traditional speech recognition systems with continuous control. In order to combine the two approaches (e.g. “volume up, mmm”) it is necessary to perform a speech/NVVI segmentation of the input sound signal. This paper presents two novel methods of speech and humming segmentation. The first method is based on classification of MFCC and RMS parameters using a neural network (MFCC method), while the other method computes volume changes in the signal (IAC method). The two methods are compared using a corpus collected from 13 speakers. The results indicate that the MFCC method outperforms IAC in terms of accuracy, precision, and recall.
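
The abstract says only that the IAC method "computes volume changes in the signal" and gives no further definition, so the following is not the authors' algorithm. It is a minimal sketch, under stated assumptions, of what frame-wise volume (RMS) and frame-to-frame volume change could look like. The function names `frame_rms` and `volume_changes_db`, the frame length, and the toy signal are all hypothetical choices made for illustration.

```python
import math

def frame_rms(samples, frame_len):
    """Return the RMS level of each non-overlapping frame of `samples`."""
    levels = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        levels.append(math.sqrt(sum(x * x for x in frame) / frame_len))
    return levels

def volume_changes_db(levels, floor=1e-9):
    """Return the level change between consecutive frames, in decibels."""
    # Clamp to a small floor so silent frames do not cause log10(0).
    db = [20.0 * math.log10(max(v, floor)) for v in levels]
    return [b - a for a, b in zip(db, db[1:])]

# Toy input (an assumption, not data from the paper): a constant-level
# 200 Hz tone followed by the same tone gated on and off every 400 samples.
rate = 8000
tone = [math.sin(2 * math.pi * 200 * n / rate) for n in range(rate)]
steady = tone[:4000]
gated = [x if (n // 400) % 2 == 0 else 0.05 * x
         for n, x in enumerate(tone[4000:])]

levels = frame_rms(steady + gated, frame_len=400)
changes = volume_changes_db(levels)
```

In this toy signal the constant-level half yields volume changes close to 0 dB, while the gated half alternates between large negative and positive changes. A real segmenter would still need a decision rule on top of such a feature, which the abstract does not specify.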

Similar resources

A Music Retrieval System with a Seamless Query Interface by Humming or Song Title

We propose a music retrieval system that enables a user to retrieve a song by two different methods: by singing its melody or by saying its title. To allow the user to use those methods seamlessly without changing a voice input mode, a method of automatically discriminating between singing and speaking voices is indispensable. We therefore designed an automatic vocal style discriminator and bui...

Effective Segmentation based on Vocal Effort Change Point Detection

Non-neutral speech data has a strong negative impact on speech processing systems such as Automatic Speech Recognition (ASR) or speaker ID systems [1]. It is therefore necessary to detect and segment non-neutral speech data before further processing steps. Alternatively, the detection and segmentation of non-neutral speech segments from an input speech stream can be used in speech analysis and ...

A Romanian Syllable-Based Text-To-Speech System

In this article we present the way we have built a syllable-based TTS system for Romanian. The system contains: a text analyser capable of separating syllables from input text and detecting accentuation, a vocal database with recorded syllables, a unit matching module and a synthesizer. The analyser was built using a LEX generator by means of two sets of phonetic rules. Vocal database was generated t...

Mirex2008: Query by Humming/singing System

This extended abstract describes my submission to the QBSH (Query by Singing/Humming) task of MIREX (Music Information Retrieval Evaluation eXchange) 2008. The system takes advantage of note-based and frame-based matching methods to improve the accuracy of the Query by Singing/Humming system. First, Earth Mover’s Distance (EMD), which is note-based and much faster, is adopted to eliminate most ...

Patient-Based Assessment of Effectiveness of Voice Therapy in Vocal Mass Lesions with Secondary Muscle Tension Dysphonia

Introduction: Use of patient-based voice assessment scales is an appropriate method that is frequently used to demonstrate effectiveness of voice therapy. This study was aimed at determining the effectiveness of voice therapy among patients with secondary muscle tension dysphonia (MTD) and vocal mass lesions. Materials and Methods: The study design was prospective, with within-participant repe...

Publication date: 2012